In [ ]:
    
# Convention for import to get shortened namespace
import numpy as np
    
In [ ]:
    
# Create an array for testing
a = np.arange(12).reshape(3, 4)
    
In [ ]:
    
a
    
Indexing in Python is 0-based, so the command below looks for the 2nd item along the first dimension (row) and the 3rd along the second dimension (column).
In [ ]:
    
a[1, 2]
    
Can also just index on one dimension
In [ ]:
    
a[2]
    
Negative indices are also allowed, which permit indexing relative to the end of the array.
In [ ]:
    
a[0, -1]
    
Slicing syntax is written as start:stop[:step], where all numbers are optional.
It should be noted that end represents one past the last item; one can also think of it as a half open interval: [start, end)
In [ ]:
    
# Get the 2nd and 3rd rows
a[1:3]
    
In [ ]:
    
# All rows and 3rd column
a[:, 2]
    
In [ ]:
    
# ... can be used to replace one or more full slices
a[..., 2]
    
In [ ]:
    
# Slice every other row
a[::2]
    
In [ ]:
    
# You can also slice using negative indices
a[:, :-1]
    
In [ ]:
    
data = [1, 3, 5, 7, 9, 11]
out = []
# Look carefully at the loop. Think carefully about the sequence of values
# that data[i] takes--is there some way to get those values as a numpy slice?
# What about for data[i + 1]?
for i in range(len(data) - 1):
    out.append((data[i] + data[i + 1]) / 2)
print(out)
    
data = np.array([1, 3, 5, 7, 9, 11])
out = (data[:-1] + data[1:]) / 2
print(out)
data = np.array([1, 3, 5, 7, 9, 11])
out = (data[2:] + data[1:-1] + data[:-2]) / 3
print(out)
In [ ]:
    
data = np.arange(12).reshape(3, 4)
# total = ?
    
print(data[0] + data[1] + data[2])
\# Or we can use numpy's sum and use the "axis" argument
print(np.sum(data, axis=0))
The solution to the last exercise introduces an important concept when working with NumPy: the axis. This indicates the particular dimension along which a function should operate (provided the function does something taking multiple values and converts to a single value).
Let's look at a concrete example with sum:
In [ ]:
    
a
    
In [ ]:
    
# This calculates the total of all values in the array
np.sum(a)
    
In [ ]:
    
# Keep this in mind:
a.shape
    
In [ ]:
    
# Instead, take the sum across the rows:
np.sum(a, axis=0)
    
In [ ]:
    
# Or do the same and take the some across columns:
np.sum(a, axis=1)
    
In [ ]:
    
# Synthetic data
temp = np.random.randn(100, 50)
u = np.random.randn(100, 50)
v = np.random.randn(100, 50)
# Calculate the gradient components
gradx, grady = np.gradient(temp)
# Turn into an array of vectors:
# axis 0 is x position
# axis 1 is y position
# axis 2 is the vector components
grad_vec = np.dstack([gradx, grady])
print(grad_vec.shape)
# Turn wind components into vector
wind_vec = np.dstack([u, v])
# Calculate advection, the dot product of wind and the negative of gradient
# DON'T USE NUMPY.DOT (doesn't work). Multiply and add.
    
advec = (wind_vec * -grad_vec).sum(axis=-1)
print(advec.shape)
In [ ]:
    
# Create some synthetic data representing temperature and wind speed data
np.random.seed(19990503)  # Make sure we all have the same data
temp = (20 * np.cos(np.linspace(0, 2 * np.pi, 100)) +
        50 + 2 * np.random.randn(100))
spd = (np.abs(10 * np.sin(np.linspace(0, 2 * np.pi, 100)) +
              10 + 5 * np.random.randn(100)))
    
In [ ]:
    
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(temp, 'tab:red')
plt.plot(spd, 'tab:blue');
    
By doing a comparision between a NumPy array and a value, we get an array of values representing the results of the comparison between each element and the value
In [ ]:
    
temp > 45
    
We can take the resulting array and use this to index into the NumPy array and retrieve the values where the result was true
In [ ]:
    
print(temp[temp > 45])
    
So long as the size of the boolean array matches the data, the boolean array can come from anywhere
In [ ]:
    
print(temp[spd > 10])
    
In [ ]:
    
# Make a copy so we don't modify the original data
temp2 = temp.copy()
# Replace all places where spd is <10 with NaN (not a number) so matplotlib skips it
temp2[spd < 10] = np.nan
plt.plot(temp2, 'tab:red')
    
Can also combine multiple boolean arrays using the syntax for bitwise operations. MUST HAVE PARENTHESES due to operator precedence.
In [ ]:
    
print(temp[(temp < 45) & (spd > 10)])
    
In [ ]:
    
# Here's the "data"
np.random.seed(19990503)  # Make sure we all have the same data
temp = (20 * np.cos(np.linspace(0, 2 * np.pi, 100)) +
        80 + 2 * np.random.randn(100))
rh = (np.abs(20 * np.cos(np.linspace(0, 4 * np.pi, 100)) +
              50 + 5 * np.random.randn(100)))
# Create a mask for the two conditions described above
# good_heat_index = 
# Use this mask to grab the temperature and relative humidity values that together
# will give good heat index values
# temp[] ?
# BONUS POINTS: Plot only the data where heat index is defined by
# inverting the mask (using `~mask`) and setting invalid values to np.nan
    
import numpy as np
\# Here's the "data"
np.random.seed(19990503)  # Make sure we all have the same data
temp = (20 * np.cos(np.linspace(0, 2 * np.pi, 100)) +
        80 + 2 * np.random.randn(100))
rh = (np.abs(20 * np.cos(np.linspace(0, 4 * np.pi, 100)) +
              50 + 5 * np.random.randn(100)))
\# Create a mask for the two conditions described above
good_heat_index = (temp >= 80) & (rh >= 0.4)
\# Use this mask to grab the temperature and relative humidity values that together
\# will give good heat index values
print(temp[good_heat_index]) 
\# BONUS POINTS: Plot only the data where heat index is defined by
\# inverting the mask (using `~mask`) and setting invalid values to np.nan
temp[~good_heat_index] = np.nan
plt.plot(temp, 'tab:red')
In [ ]:
    
print(temp[0])
    
We can also extract the first, fifth, and tenth elements:
In [ ]:
    
print(temp[[0, 4, 9]])
    
One of the ways this comes into play is trying to sort numpy arrays using argsort. This function returns the indices of the array that give the items in sorted order. So for our temp "data":
In [ ]:
    
inds = np.argsort(temp)
print(inds)
    
We can use this array of indices to pass into temp to get it in sorted order:
In [ ]:
    
print(temp[inds])
    
Or we can slice inds to only give the 10 highest temperatures:
In [ ]:
    
ten_highest = inds[-10:]
print(temp[ten_highest])
    
There are other numpy arg functions that return indices for operating:
In [ ]:
    
np.*arg*?